Effective Phrase Prediction
نویسندگان
چکیده
Autocompletion is a widely deployed facility in systems that require user input. Having the system complete a partially typed “word” can save user time and effort. In this paper, we study the problem of autocompletion not just at the level of a single “word”, but at the level of a multi-word “phrase”. There are two main challenges: one is that the number of phrases (both the number possible and the number actually observed in a corpus) is combinatorially larger than the number of words; the second is that a “phrase”, unlike a “word”, does not have a well-defined boundary, so that the autocompletion system has to decide not just what to predict, but also how far. We introduce a FussyTree structure to address the first challenge and the concept of a significant phrase to address the second. We develop a probabilistically driven multiple completion choice model, and exploit features such as frequency distributions to improve the quality of our suffix completions. We experimentally demonstrate the practicability and value of our technique for an email composition application and show that we can save approximately a fifth of the keystrokes typed.
منابع مشابه
Learning Rules for Chinese Prosodic Phrase Prediction
This paper describes a rule-learning approach towards Chinese prosodic phrase prediction for TTS systems. Firstly, we prepared a speech corpus having about 3000 sentences and manually labelled the sentences with two-level prosodic structure. Secondly, candidate features related to prosodic phrasing and the corresponding prosodic boundary labels are extracted from the corpus text to establish an...
متن کاملUsing multiple linguistic features for Mandarin phrase break prediction in maximum-entropy classification framework
We model Mandarin phrase break prediction as a classification problem with three level prosodic structures and apply conditional maximum entropy classification to this problem. We acquire multiple levels of linguistic knowledge from an annotated corpus to become well-integrated features for maximum entropy framework. Five kinds of features were used to represent various linguistic constraints i...
متن کاملIncorporating second-order information into two-step major phrase break prediction for Korean
In this paper, we present a new phrase break prediction method that integrates second-order information into general maximum entropy model. The phrase break prediction problem was mapped into a classification problem in our research. The features we used for the prediction of phrase breaks are of several layers such as local features (part-of-speech (POS) tags, a lexicon, lengths of eojeols and...
متن کاملProsody prediction for speech synthesis using transformational rule-based learning
Prediction of symbolic prosodic labels (pitch accents and phrase structure) is an important step in generating natural synthetic speech. This paper investigates a new automatically trainable procedure for combined accent and phrase prediction based on transformational rule-based learning. Experimental results on a radio news corpus show that accent prediction bene ts from phrase structure, but ...
متن کاملTODO: This is a placeholder. Final title will be filled later
In this paper, we present a new phrase break prediction method that integrates second-order information into general maximum entropy model. The phrase break prediction problem was mapped into a classification problem in our research. The features we used for the prediction of phrase breaks are of several layers such as local features (part-of-speech (POS) tags, a lexicon, lengths of eojeols and...
متن کاملSignificance of word-terminal syllables for prediction of phrase breaks in text-to-speech systems for Indian languages
Phrase break prediction is very important for speech synthesis. Traditional methods of phrase break prediction have used linguistic resources like part-of-speech (POS) sequence information for modeling these breaks. In the context of Indian languages, we propose to look at syllable level features and explore the use of word-terminal syllables to model phrase breaks. We hypothesize that these te...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2007